Data Propagation in a Distributed File System
نویسندگان
چکیده
Distributed file systems benefit greatly from optimistic replication that allows clients to update any replica at any site. However, such systems face a new dilemma: Once data is updated at a replica site, when should it be shipped to others? In conventional file system workloads, most data is read by its writer, so it needs to be shipped only for administrative reasons. Unfortunately, shipping on demand substantially penalizes those who do share, while shipping aggressively places undue load on the system, limiting scalability. This paper explores a mechanism to predict and support sharing in a wide area file system. This mechanism uses the observation that updates that invalidate cached objects elsewhere are likely to be shared, and they should be propagated. The mechanism is evaluated on its precision, recall, and F-measure over traces capturing live file system uses. It reduces data shipped by orders of magnitude compared to aggressive schemes, while reducing the performance penalty of on-demand shipment by nearly a factor of two.
منابع مشابه
Ex Vivo Comparison of File Fracture and File Deformation in Canals with Moderate Curvature: Neolix Rotary System versus Manual K-files
Background and Aim: Cleaning and shaping is one of the important steps in endodontic treatment, which has an important role in root canal treatment outcome. This study evaluated the rate of file fracture and file deformation in Neolix rotary system and K-files in shaping of the mesiobuccal canal of maxillary first molars with moderate curvature. Materials and Methods: In this ex vivo exp...
متن کاملExperience Using a Globally Shared State Abstraction to Support Distributed Applications
In this paper, we evaluate the effectiveness of basing distributed systems on a persistent globally shared address space abstraction, as implemented by Khazana. Khazana provides shared state management services to distributed application developers, including consistent caching, automated replication and migration of data, location management, access control, and (limited) fault tolerance. We r...
متن کاملE2DR: Energy Efficient Data Replication in Data Grid
Abstract— Data grids are an important branch of gird computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client's applications in scientific and business domai...
متن کاملOperation-based Update Propagation in a Mobile File System
In this paper we describe a technique called operationbased update propagation for efficiently transmitting updates to large files that have been modified on a weakly connected client of a distributed file system. In this technique, modifications are captured above the file-system layer at the client, shipped to a surrogate client that is strongly connected to a server, re-executed at the surro...
متن کاملUW CENTER FOR PATTERN ANALYSIS AND MACHINE INTELLIGENCE GRADUATE SEMINAR SERIES Distributed Random Finite Set Theoretic Soft/Hard Data Fusion: Target Tracking Application
The development of data fusion systems capable of incorporating soft humangenerated data into the fusion process is an emerging trend in the fusion community, motivated mainly by asymmetric warfare situations where the observational opportunities for traditional hard sensors are restricted. This paper describes an extension of our prototype soft/hard data fusion system, based on RFS theory, fro...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004